On the Shortest Common Superstring of NGS Reads

Authors

  • Tristan Braquelaire
  • Marie Gasparoux
  • Mathieu Raffinot
  • Raluca Uricaru
Abstract

The Shortest Superstring Problem (SSP) asks, for a set of strings S = {s1, ..., sn}, for a minimum-length string that contains every si, 1 ≤ i ≤ n, as a substring. This problem is known to be NP-complete and APX-hard. Approximation algorithms with guaranteed ratios have been proposed; the current best ratio is 2 11/23, reached after a long and difficult quest. However, SSP is widely used in practice on next-generation sequencing (NGS) data, which plays an increasingly important role in sequencing. In this note, we show that the SSP approximation ratio can be improved on NGS reads by assuming specific characteristics of NGS data that are experimentally verified on a very large sampling set.
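
To make the problem statement concrete, here is a minimal Python sketch of the classical greedy merging heuristic: repeatedly merge the two strings with the largest suffix-prefix overlap. It is a well-known baseline, not the method of this note, and the function names `overlap` and `greedy_superstring` as well as the sample reads are assumptions chosen for illustration.

```python
def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_superstring(strings):
    """Greedy heuristic for SSP (illustrative baseline, not the paper's method).

    Returns a string containing every input string as a substring;
    the result is not guaranteed to be of minimum length.
    """
    # Drop duplicates, then drop strings already contained in another string.
    items = list(dict.fromkeys(strings))
    items = [s for s in items if not any(s != t and s in t for t in items)]

    while len(items) > 1:
        # Find the ordered pair (i, j) with the largest overlap.
        best_k, best_i, best_j = -1, 0, 1
        for i, a in enumerate(items):
            for j, b in enumerate(items):
                if i != j:
                    k = overlap(a, b)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        # Merge the pair, writing the shared part only once.
        merged = items[best_i] + items[best_j][best_k:]
        items = [s for idx, s in enumerate(items)
                 if idx not in (best_i, best_j)] + [merged]

    return items[0] if items else ""


reads = ["ACGTAC", "GTACGG", "TACGGA"]  # hypothetical toy reads
result = greedy_superstring(reads)
print(result)  # ACGTACGGA
```

On these three toy reads the heuristic first merges `GTACGG` and `TACGGA` (overlap 5), then prepends `ACGTAC` (overlap 4), giving a superstring of length 9 instead of the 18 characters of plain concatenation. The quadratic pair search is kept for readability and would be far too slow for real NGS read sets.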

Similar resources

Selecting the Shortest Superstring in DNA Using the Particle Swarm Optimization Algorithm

A DNA string can be viewed as a very long string over a four-letter alphabet. Many scientists attempt to decode this string. Because it is very long, shorter, mutually overlapping sections of it are decoded instead. There is no information about the correct position of these sections on the main DNA string. It seems that the shortest string (substring of the main DNA string) i...

On Reoptimization of the Shortest Common Superstring Problem

In general, reoptimization makes it possible to obtain a solution for a larger instance from a solution for a smaller instance. In this paper, we consider the use of reoptimization to solve the shortest common superstring problem.

The Shortest Common Superstring Problem

We consider the shortest common superstring problem and describe an approach to solving it, based on an explicit reduction from the problem to the satisfiability problem.

Approximating the Shortest Superstring Problem Using de Bruijn Graphs

The best known approximation ratio for the shortest superstring problem is 2 11/23 (Mucha, 2012). In this note, we improve this bound for the case when the length of all input strings is equal to r, for r ≤ 7. For example, for strings of length 3 we get a 1 1/3-approximation. An advantage of the algorithm is that it is extremely simple both to implement and to analyze. Another advantage is tha...

Approximating Shortest Superstring Problem Using de Bruijn Graphs

The best known approximation ratio for the shortest superstring problem is 2 11/23 (Mucha, 2012). In this note, we improve this bound for the case when the length of all input strings is equal to r, for r ≤ 7. E.g., for strings of length 3 we get a 1 1/3-approximation. An advantage of the algorithm is that it is extremely simple both to implement and to analyze. Another advantage is that it is...


Publication date: 2017